CageCavityCalc (C3): A Computational Tool for Calculating and Visualizing Cavities in Molecular Cages

Organic (porous) and metal–organic cages are promising biomimetic platforms with diverse applications spanning recognition, sensing, and catalysis. The key to the emergence of these functions is the presence of well-defined inner cavities capable of binding a wide range of guest molecules and modulating their properties. However, despite the myriad cage architectures currently available, the rational design of structurally diverse and functional cages with specific host–guest properties remains challenging. Efficiently predicting such properties is critical for accelerating the discovery of novel functional cages. Herein, we introduce CageCavityCalc (C3), a Python-based tool for calculating the cavity size of molecular cages. The code is available on GitHub at https://github.com/VicenteMartiCentelles/CageCavityCalc. C3 utilizes a novel algorithm that enables the rapid calculation of cavity sizes for a wide range of molecular structures and porous systems. Moreover, C3 facilitates easy visualization of the computed cavity size alongside hydrophobic and electrostatic potentials, providing insights into host–guest interactions within the cage. Furthermore, the calculated cavity can be visualized using widely available visualization software, such as PyMol, VMD, or ChimeraX. To enhance user accessibility, a PyMol plugin has been created, allowing nonspecialists to use this tool without requiring computer programming expertise. We anticipate that the deployment of this computational tool will significantly streamline cage cavity calculations, thereby accelerating the discovery of functional cages.


S1. Python module and command line interface
The module can be used from the command line or from a Python file by loading the CageCavityCalc module. To use C3 from the command line, the user executes the command "CageCavityCalc -f cage.pdb -o cage_cavity.pdb -gr 1.5" in the Anaconda Prompt. This command loads the cage.pdb file containing the chemical structure of the cage and calculates the cavity using a grid spacing of 1.5 Å. Additional arguments, described in Table S1, allow the user to specify the distance threshold used to calculate the 90° angle, enable the clustering algorithm that removes noisy cavity points not belonging to the main cavity, calculate the hydrophobicity with a chosen method and distance function, calculate the ESP, save a PyMol pml file, or print additional information about the calculation in the terminal.
Table S1. Arguments that can be used in the C3 Python module through the command line.
We also provide a more complex example to show additional functionality of C3. In this example, the cage.pdb file is loaded, and the cavity is then computed using a grid spacing of 1.0 Å and a distance threshold for the 90-degree calculation of 2.0 times the window size. Note that this code uses the same implementation of the distance threshold for the 90-degree calculation as the PyMol plugin. If the computed window size is very small, so that the resulting threshold is smaller than 5 Å, the threshold is set to 5 Å to ensure that the probe finds atoms with which to calculate the angle.
The Python module can be integrated into more complex programs. For example, it can read a cage class from cgbind, which enables the construction of a cage from a ligand SMILES string, a metal, and the cage topology.
The cavity volumes obtained for all the structures of a trajectory can be saved using a script. The number of atoms is kept constant across the structures by placing, when necessary, the additional grid points on an existing grid point or, when there is no cavity, at (0,0,0).
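The padding rule for trajectories can be illustrated with a minimal sketch. It does not use the C3 API: the function names (pad_cavity_points, pad_trajectory) are hypothetical, the example coordinates are invented, and the choice of which existing grid point to duplicate (here, the last one of the frame) is an assumption, since the text only states that extra points are placed on an existing grid point or at (0,0,0).

```python
from typing import List, Tuple

Point = Tuple[float, float, float]


def pad_cavity_points(points: List[Point], n_target: int) -> List[Point]:
    """Pad one frame's cavity points so that it holds exactly n_target points."""
    if len(points) > n_target:
        raise ValueError("n_target must be at least the number of cavity points")
    # No cavity in this frame: fall back to the origin.
    # Otherwise duplicate an existing point (assumption: the last one).
    filler = points[-1] if points else (0.0, 0.0, 0.0)
    return list(points) + [filler] * (n_target - len(points))


def pad_trajectory(frames: List[List[Point]]) -> List[List[Point]]:
    """Give every frame the same number of points as the largest frame."""
    n_target = max((len(frame) for frame in frames), default=0)
    return [pad_cavity_points(frame, n_target) for frame in frames]


# Hypothetical three-frame trajectory: 3 points, 1 point, and no cavity.
frames = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
    [(0.5, 0.5, 0.5)],
    [],
]
padded = pad_trajectory(frames)
print([len(frame) for frame in padded])  # [3, 3, 3]
```

Because the duplicated points coincide with existing ones, they should not change the displayed cavity, while a constant atom count lets trajectory viewers load all frames as one object.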

S2. PyMol plugin
The C3 Python module is integrated into PyMol as a plugin. The plugin provides a user interface for selecting the different parameters of the cavity calculation. First, the user starts PyMol by typing "pymol" in the Anaconda Prompt.
Then, in the PyMol interface, the user loads the desired cage file using File > Open and selecting the "cage.pdb" file. To start the C3 plugin, the user goes to Plugin > CageCavityCalc. All the options that the user can adjust are described in Figure S1. Once all the options are selected, the user clicks on "Calculate volume" to start the calculation of the cavity and all the selected properties. Once the computation is finished, the computed cavity and the cavity colored by the selected properties are displayed in PyMol. The PyMol plugin stores all computed properties in the same PyMol session file, allowing the user to select which one to display and to save PDB files of each property. To save the session file, the user goes to File > Save Session As. The user can select which computed property to display by clicking on the generated cavity objects in the right panel, i.e., clicking a cavity object toggles it between hidden and shown (see Figure 9 in the manuscript). To obtain a good-quality image of the cage and the cavity, the user types "ray" in the PyMol command line; the resulting image can then be saved using File > Export Image As > PNG.

S3. Installation instructions
The installation of C3 requires the following steps. The software is compatible with Linux, Windows, and macOS.
First, Miniconda3 (Python 3.7 or later) must be installed; it can be obtained from https://docs.conda.io/en/latest/miniconda.html. Then, in the "Anaconda Prompt", the command line version of C3 is installed using pip: "pip install CageCavityCalc". This also installs the required dependencies.
As with any program, to run CageCavityCalc from the command line it is necessary either to add its installation folder to the system path or to execute the CageCavityCalc.py file directly from the folder. For example, in Windows the user needs to add the folder "C:\Users\UserName\miniconda3\Lib\site-packages\CageCavityCalc\" to the Python path by navigating through the following menus: My Computer > Properties > Advanced System Settings > Environment Variables > PYTHONPATH.
To install the C3 PyMol plugin, the open-source version of PyMol must first be installed from the "Anaconda Prompt" using the command "conda install -c conda-forge pymol-open-source" (Windows, macOS, and Linux). Alternatively, for Windows it can be installed from https://www.cgohlke.com/. The following dependencies are also required: "pip install pyqt5 qtpy" (in some cases it may be necessary to uninstall pyqt5 with "pip uninstall pyqt5" and then run "pip install pyqt5 qtpy" again). On computers with Spanish regional settings, running the plugin requires changing the regional settings to use a point instead of a comma as the decimal separator.
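Before launching the plugin, the decimal separator of the active regional settings can be checked from Python. The sketch below uses only the standard library locale module; the helper name decimal_separator is hypothetical, and whether the plugin reads the separator through this same mechanism is an assumption not stated here, so the check is only a quick indication.

```python
import locale


def decimal_separator() -> str:
    """Return the decimal separator of the user's regional settings."""
    try:
        # Adopt the numeric conventions configured in the operating system.
        locale.setlocale(locale.LC_NUMERIC, "")
    except locale.Error:
        # Unsupported or misconfigured locale: keep the current setting.
        pass
    return locale.localeconv()["decimal_point"]


sep = decimal_separator()
if sep != ".":
    print(f"Decimal separator is {sep!r}; change the regional settings to use '.'")
else:
    print("Decimal separator is '.'")
```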
Once PyMol is installed, the plugin is installed from within PyMol via: Plugin > Plugin Manager > Install

S4. Examples of cavity calculation for cages C1-C16
In our cavity calculations we used the X-ray diffraction structures of the cages from the Cambridge Structural Database (CSD). The cage structures were prepared from the original CIF files from the CSD by removing non-cage molecules using Wavefunction Spartan '20 (ref. S1). Additionally, the outward-pointing phenyl groups of cage C9 were removed.
The cavities of cages C1-C16 were calculated using C3 and the optimized parameters described in Table S2. For parameter optimization, the grid spacing for the small-medium cages C1-C12 (i.e., cavity volumes less than 1000 Å³) was set to 0.5 Å. The larger cages C13-C16 required increasing the grid spacing to 1.0-3.5 Å to run in a reasonable time and to stay within hardware limitations. For each cage reported in Table S2, the overall calculation takes from seconds to minutes (typically from 30 seconds to 5 minutes), depending on the grid resolution and cavity size, on a PC with an Intel(R) Core(TM) i9-10900K CPU @ 3.70GHz processor.
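The need for a coarser grid on large cages follows from simple arithmetic: the number of grid points grows with the cube of the box edge divided by the spacing. The sketch below is illustrative only. It assumes a cubic grid box and assumes that each cavity grid point represents one cubic voxel of volume spacing³; neither the box shape nor the exact volume formula used by C3 is stated in this section, and the 40 Å box edge is a hypothetical value.

```python
import math


def grid_point_count(box_edge: float, spacing: float) -> int:
    """Number of points in a cubic grid of edge box_edge (Å) at the given spacing (Å)."""
    per_axis = math.floor(box_edge / spacing) + 1
    return per_axis ** 3


def volume_from_points(n_points: int, spacing: float) -> float:
    """Cavity volume (Å^3), assuming one cubic voxel of edge `spacing` per point."""
    return n_points * spacing ** 3


# Hypothetical 40 Å box: halving the spacing multiplies the point count by about 8.
for spacing in (2.0, 1.0, 0.5):
    print(f"spacing {spacing} Å -> {grid_point_count(40.0, spacing)} grid points")

# 1000 cavity points at 0.5 Å spacing correspond to 125 Å^3 under this assumption.
print(volume_from_points(1000, 0.5))  # 125.0
```

Under these assumptions, going from 1.0 Å to 0.5 Å spacing raises the point count from 68921 to 531441, which is consistent with the longer run times reported for fine grids on large cages.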

S6. Evaluation of the Grid-Spacing Sensitivity (GSS)
The Grid-Spacing Sensitivity (GSS) was evaluated for cages C1-C16 by increasing the grid spacing from the values described in Table S2.

S7. Evaluation of the Mouth Opening Ambiguity (MOA)
The Mouth Opening Ambiguity (MOA) was evaluated by using values of the distance threshold for the 90-degree calculation (dt90) from 1 to 5 in window size units, together with the parameters described in Table S2. Note that if the computed window size is very small, so that the resulting threshold for the 90-degree calculation is smaller than 5 Å, the threshold is set to 5 Å to ensure that the probe finds atoms with which to calculate the angle.
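The threshold rule described above amounts to a scaled window size with a lower bound of 5 Å. A minimal sketch follows; the function and constant names are hypothetical and are not taken from the C3 source code.

```python
MIN_THRESHOLD_ANGSTROM = 5.0  # lower bound stated in the text


def distance_threshold_90(window_size: float, dt90: float) -> float:
    """Distance threshold (Å) for the 90-degree calculation.

    window_size: computed window size of the cage (Å).
    dt90: multiplier in window size units (1 to 5 in the MOA evaluation).
    """
    if window_size < 0 or dt90 <= 0:
        raise ValueError("window_size must be >= 0 and dt90 must be > 0")
    # Scale the window size, but never go below the 5 Å floor.
    return max(dt90 * window_size, MIN_THRESHOLD_ANGSTROM)


# Small window: 2.0 x 1.8 Å = 3.6 Å, below the floor, so 5 Å is used.
print(distance_threshold_90(1.8, 2.0))  # 5.0
# Larger window: 2.0 x 4.0 Å = 8.0 Å is used as is.
print(distance_threshold_90(4.0, 2.0))  # 8.0
```

For cages with very small windows the floor applies over much of the dt90 range, so changing dt90 there has little or no effect on the threshold.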
The cavity volume obtained for cages with small windows (C1-C5, C8, and C11) does not show any dependency on the distance threshold for the 90-degree calculation, and therefore no optimization by the user is required. Cages with larger windows behave differently depending on the structure of each cage. Cages C9, C11, and C12 do not show any dependency on dt90. Cage C6 shows a 2% variation of the calculated volume from dt90 1.0 to 2.5; if dt90 is more than 3.0, variations of the calculated volume of up to -72% are observed. In contrast, the similarly shaped cage C7 shows a 2% variation of the calculated volume over the whole test range of dt90 1.0 to 5.0. For cages with larger openings (C13-C16), it is necessary to inspect the computed cavity visually and adjust the distance threshold for the 90-degree calculation accordingly. The results are presented in Figure S3.
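The percent variations quoted above can be computed from a dt90 scan with a few lines of Python. The sketch below is an assumption-laden illustration: the helper names are hypothetical, the reference is taken to be the volume at the smallest dt90 (the reference used for the reported percentages is not specified here), and the volumes in the example are invented placeholders, not values from the paper.

```python
from typing import Dict, List


def percent_variation(volume: float, reference: float) -> float:
    """Signed variation of `volume` relative to `reference`, in percent."""
    if reference <= 0:
        raise ValueError("reference volume must be positive")
    return 100.0 * (volume - reference) / reference


def stable_dt90_values(volumes: Dict[float, float], tolerance_pct: float = 5.0) -> List[float]:
    """dt90 values whose volume stays within tolerance of the smallest-dt90 volume."""
    reference = volumes[min(volumes)]
    return [
        dt90
        for dt90 in sorted(volumes)
        if abs(percent_variation(volumes[dt90], reference)) <= tolerance_pct
    ]


# Hypothetical scan (dt90 -> cavity volume in Å^3), for illustration only.
scan = {1.0: 500.0, 2.0: 505.0, 3.0: 510.0, 4.0: 140.0}
print(stable_dt90_values(scan))          # [1.0, 2.0, 3.0]
print(percent_variation(140.0, 500.0))   # -72.0
```

A scan of this kind flags the dt90 range in which the volume is insensitive to the threshold; outside that range the computed cavity should still be inspected visually.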

Figure S1. Screenshot of the C3 GUI of the PyMol plugin.
To use C3 in a Python script, the module is loaded, the cavity object is initialized, and the .pdb file of the cage is loaded. The cavity volume is then calculated (using the default values: a grid spacing of 1 Å and a distance threshold for the 90-degree calculation of 5 Å), and the corresponding *.pdb file and PyMol *.pml file are saved for cavity visualization in PyMol.

S8. Systematic comparison with other cavity calculation software
Chem. Inf. Model. 2023, 63, 3772 (ref. S2).
The parameters used for each software are presented in the following tables. All parameters not included in the tables were kept at their default values. Note that the program PyWindow has no customizable parameters; therefore, no parameter needed to be adjusted.

Table S8. Ghecom detection parameters (web version, https://pdbj.org/ghecom/, accessed 20/4/24).
For POVME, the center of the inclusion region was set to the center of mass. The radius was chosen visually to best represent the cavity.
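The center used for the POVME inclusion region can be obtained as a mass-weighted average of the atomic coordinates. The sketch below is a generic illustration and not part of POVME or C3: the function name, the small mass table, and the example coordinates are assumptions, and it assumes that the center of mass is taken over the cage atoms.

```python
from typing import Iterable, Tuple

Coordinate = Tuple[float, float, float]

# Standard atomic masses (u) for a few elements; extend as needed.
ATOMIC_MASS = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999, "Pd": 106.42}


def center_of_mass(atoms: Iterable[Tuple[str, Coordinate]]) -> Coordinate:
    """Mass-weighted center of (element, (x, y, z)) records, coordinates in Å."""
    total = sx = sy = sz = 0.0
    for element, (x, y, z) in atoms:
        mass = ATOMIC_MASS[element]
        total += mass
        sx += mass * x
        sy += mass * y
        sz += mass * z
    if total == 0.0:
        raise ValueError("no atoms given")
    return (sx / total, sy / total, sz / total)


# Hypothetical symmetric fragment: the center of mass lies at the origin.
atoms = [
    ("Pd", (5.0, 0.0, 0.0)),
    ("Pd", (-5.0, 0.0, 0.0)),
    ("C", (0.0, 2.0, 0.0)),
    ("C", (0.0, -2.0, 0.0)),
]
print(center_of_mass(atoms))
```

The resulting coordinates would serve as the center of the inclusion sphere, with the radius still chosen by visual inspection as described above.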